Abstract

Slovenia's Current Research Information System (SICRIS) currently hosts 86,443 publications with citation data from
8,359 researchers working across the social and natural sciences from 1970 to the present. Using these
data, we show that the citation distributions derived from individual publications have Zipfian properties in that they
can be fitted by a power law $P(x) \sim x^{-\alpha}$, with $\alpha$ between 2.4 and 3.1 depending on the institution and
field of research. Distributions of indexes that quantify the success of researchers rather than individual publications,
on the other hand, cannot be associated with a power law. We find that for Egghe's g-index and Hirsch's h-index the
log-normal form $P(x) \sim \exp[-a \ln x - b(\ln x)^2]$ applies best, with $a$ and $b$ depending moderately on the
underlying set of researchers.
In special cases, particularly for institutions with a strongly hierarchical constitution and research fields with high 
self-citation rates, exponential distributions can be observed as well. Both indexes yield distributions with equivalent 
statistical properties, which is a strong indicator for their consistency and logical connectedness. At the same time, 
differences in the assessment of citation histories of individual researchers strengthen their importance for properly 
evaluating the quality and impact of scientific output. 
1. Introduction



Ranking of researchers is both important and interesting. Its importance is largely due to the determination
of advancement and selection criteria that underlie faculty recruitment or the awarding of research grants and
funds to individuals with the best indicators (Garfield, 1983; Adam, 2002; Ventura and Mombrú, 2006), while the fact
that it is interesting has many more aspects worth considering. For one, researchers seem to have a keen interest in
determining who is the most cited, the most connected or the most influential of them all. Certainly this is in part to
gratify the personal sense of achievement, but more intricately, there is a lot we do not yet understand in terms of
how and why certain researchers get more attention than others, and why some cannot rise above a given level of
recognition. Scientific excellence is definitely a crucial factor to consider, yet that alone cannot explain all the
fascinating properties that have been revealed in recent years with regard to citation distributions (Egghe and
Rousseau, 1990; Laherrère and Sornette, 1998; Redner, 1998, 2005; Radicchi et al., 2008; Vieira and Gomes, 2010), indexes
that quantify individual scientific output (Hirsch, 2005; Egghe, 2006, 2008a; Bornmann et al., 2008; Zhang, 2009;
Guns and Rousseau, 2009; Cabrerizo et al., 2010), the importance of first-movers (Newman, 2009) and self-citations
(Fowler and Aksnes, 2007; Schreiber, 2007, 2008a), or the structure of scientific collaboration networks (Newman),
to name but a few.

Empirical studies are important since they provide fuel for potential attempts at modeling and related theoretical 
approaches aimed towards deepening our understanding of citation practices, as well as for sharpening criteria and 
indexes that quantify individual scientific output. Notably, one fact stands quite solid and has been pointed out on
several occasions [see e.g. Redner (2005)], namely that the more a paper is cited, the more likely it is to attract
further citations in the future. This phenomenon is by now known under different names. The Matthew effect (Merton)
is likely the oldest to describe it, but one can also come across cumulative advantage (de Solla Price, 1965, 1976)
or preferential attachment (Barabási and Albert, 1999), depending on the field of research and motivation of the study.
Linear preferential attachment models in particular enjoy exceptional popularity in describing the growth and setup
of complex networks (Albert and Barabási, 2002; Dorogovtsev and Mendes, 2003; Pastor-Satorras and Vespignani,
2004) and have become synonymous with power-law distributions of connections that can be observed in many of
them (Faloutsos et al., 1999; Sornette, 2003; Newman, 2005; Clauset et al., 2009). There is evidence suggesting that
citation statistics may obey similar rules, yet deviations from the power-law distribution leave the reasoning
open to amendments (Redner, 2005), especially in the sense of sublinear or near-linear preferential attachment, which
is known to yield stretched exponential or log-normal forms (Krapivsky et al., 2000; Dorogovtsev and Mendes, 2000;
Dorogovtsev et al., 2000; Krapivsky et al., 2001; Krapivsky and Redner, 2001).



Here we present the analysis of 40 years of Slovenia's research output across the whole of social and natural
sciences in search of signs of self-organization and laws that underlie many aspects of our existence. Zipf's law
(Zipf, 1949) in particular is related to the frequent occurrence of power-law distributions, with examples ranging from
the frequency of words in a given language, income rankings and population counts of cities to avalanche and forest-fire
sizes (Newman). We show that the citation distributions derived from individual publications, i.e. determined as the
number of publications with a certain number of citations, are of power-law type, which indeed seems to confirm the
assumption of linear preferential attachment underlying their accumulation. However, by taking into consideration
not individual publications but rather individual researchers, we find that the power-law distributions give way to
log-normal, and in special cases also exponential (Laherrère and Sornette, 1998), distributions. Notably, both the
g-index (Egghe, 2006) and the h-index (Hirsch, 2005), as well as the total citation count per researcher, show equivalent
statistical properties in terms of their distributions. This suggests that these measures share a relatively high degree
of logical connectedness and cannot be distinguished on large scales. However, differences between them can be
crucial for the ranking of individual researchers within specific groups or fields of research. Since log-normal forms
are typically associated with random multiplicative processes, the assumption of linear preferential attachment as the
main driving force behind the citation record of an individual researcher seems no longer valid. Certainly it plays a
role, but the "personality" of a researcher brings with it additional factors that require a different interpretation. An
important role seems to be played by the fact that all researchers more or less frequently publish papers that do not
receive a lot of attention. At the same time, a researcher can gather a considerable number of citations even if s/he does
not publish a single highly cited paper. Altogether, these considerations, which are absent when considering individual
publications as reference points, amount to an override of the power-law distribution. We also point out that, as
discovered already by Redner (1998), no single function can describe the examined distributions over the whole range of
values. Power laws emerge due to collective effects, synonymous with preferential attachment, which apply to well-cited
publications only. Papers that are not cited frequently do not benefit from such or similar effects and are forgotten soon
after their publication. The presented results thus fit well with known facts and provide a cohesive overview of factors
that affect the distributions of citations and other measures of scientific output.

The paper is structured as follows. In the next section we provide basic facts about Slovenia and the analyzed data 
set. We also review basic properties of Zipf plots, power-law and log-normal distributions, which will be called upon 
when presenting the main results in section 3. In the last section we summarize our findings and briefly discuss their 
implications for the national selection criteria currently employed by the Slovenian Research Agency. 



2. Preliminaries 

Slovenia is a small country located at the heart of Europe with a population of two million.² It has a very
well-documented research history, which is made possible by SICRIS - Slovenia's Current Research Information System.³
At present, Slovenia has 30,630 registered researchers (including young and non-active researchers as well as
laboratory personnel), of which 8,359 have at least one bibliographic unit that is indexed by the Web of Science (WoS).



¹ A comprehensive list of publications devoted to Zipf's law is accessible via http://www.nslij-genetics.org/wli/zipf/ (by Wentian Li).
² The official Web page of Slovenia is accessible via http://www.slovenia.si/.
³ The SICRIS Web page is accessible via http://sicris.izum.si/.
Currently there are 86,443 publications linked to WoS with a total of 835,970 citations that have accumulated from
1970 to the present. Bibliographies of researchers are updated continuously by a group of specialized libraries that
catalogue new publications as soon as they are collated, while the citation data of all bibliographic units are updated
monthly via a direct link to WoS.

Since the SICRIS database is publicly available, we have retrieved full publication records by means of an
automated information retrieval algorithm, allowing us to keep the statistics as up-to-date as possible. Subsequently, the
bibliographic records were parsed for citation counts and other measures that are relevant for assessing the scientific
output of individual researchers. Besides analyzing the data as a whole, we consider separately the University of
Ljubljana (Slovenia's oldest and largest university) and the "Jozef Stefan" Institute (Slovenia's leading research
institute), as well as researchers that designated medicine or chemistry as their primary research fields. Since the tables
are too big to fit here, we made them available online at http://www.matjazperc.com/sicris/stats.html. The Web page
also features tables for a few other institutions and fields of research, but here we focus on the representative and
most interesting examples listed above. Note that the tables can be ordered according to different categories. Some
trivia: Slovenia's most cited researcher to date is Robert Blinc, with 10,891 citations to his name. Slovenia's most
cited paper, currently having 1,374 citations, is due to Latif et al., entitled "Identification of the von Hippel-Lindau
disease tumor suppressor gene", which appeared in Science 260, 1317-1320 (1993). The largest g-index belongs to Uros
Seljak (92), while the largest h-index belongs to Vito Turk (53). Of the 86,443 publications indexed by WoS, 22,730 are
uncited, 23,206 are cited at least 10 times, 729 are cited at least 100 times, while 8 have more than 1,000 citations.
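For readers who wish to recompute the two indexes from a researcher's list of per-paper citation counts, the following is a minimal sketch that follows the usual definitions of the h-index (Hirsch, 2005) and the g-index (Egghe, 2006). It is not the retrieval or parsing code used for this study; the function names are hypothetical, and capping g at the number of published papers is an assumption (some variants of the g-index allow it to exceed that number).

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only decrease from here on, so no larger h exists
    return h


def g_index(citations):
    """Largest g such that the g most cited papers total at least g*g citations.

    Assumption: g is capped at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Illustrative citation record (made-up numbers, not SICRIS data).
example = [10, 8, 5, 4, 3, 0]
print(h_index(example), g_index(example))
```

Because the g-index accumulates citations over the top-ranked papers, it is never smaller than the h-index for the same record, which is one reason the two indicators can rank individual researchers differently.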

In what follows, we first examine the distributions of citations to individual papers, whereby we first construct
Zipf plots of the number of citations $x_k$ versus the k-th ranked paper. On a double logarithmic scale, a usable linear
fit of the Zipf plot with slope $\gamma$ indicates a power-law distribution of citations $P(x) \sim x^{-\alpha}$, where
$\alpha = 1 + 1/\gamma$. Likewise, the cumulative distribution of citations $Q(x)$, defined as the probability that a
paper has at least $x$ citations, is proportional to $x^{-\beta}$, where $\beta = \alpha - 1 = 1/\gamma$. Note that the
joint consideration of distributions and cumulative distributions, besides the fact that the latter alleviates
statistical fluctuations, is useful since it helps to pinpoint the presence of a power law. Namely, if
$P(x) \sim x^{-\alpha}$ (a power law with slope $\alpha$), then $Q(x)$ will also be a power law, but with the slope
$\alpha - 1$ rather than $\alpha$. On the other hand, if $P(x) \sim \exp(-x/\kappa)$ (exponential with slope $\kappa$),
then $Q(x)$ will also be exponential, but with the same exponent (Newman, 2003). Thus, plotting $P(x)$ and $Q(x)$ on
logarithmic or semi-logarithmic scales makes it easy to distinguish power-law from exponential distributions. In a
similar fashion, we subsequently construct Zipf plots of the g-index $g_k$ and the h-index $h_k$ versus the k-th ranked
researcher, and plot the pertaining cumulative distribution functions $Q(g)$ and $Q(h)$. Unlike for individual
publications, the Zipf plots have a negative curvature on a double logarithmic scale or can be fitted by a straight line
on a semi-log scale, which indicates $Q(g) \sim \exp[-a \ln g - b(\ln g)^2]$ or $Q(g) \sim \exp(-g/\kappa)$,
respectively. For individual researchers we do not consider the classical distributions of the g-index $P(g)$ and the
h-index $P(h)$ since the statistical fluctuations are too strong, especially for the considered subsets of the whole
population. All nonlinear fits presented in this paper have been made with the Levenberg-Marquardt method (Press et al.,
1995), and the goodness of fit has been tested by means of the coefficient of determination $R^2$. Since, however, this
procedure can yield substantially inaccurate fits, we have also performed maximum-likelihood fitting and the p-value
test, as advocated in the review by Clauset et al. (2009).⁵ Given that $Q(g)$ and $Q(h)$ have equivalent statistical
properties, we finally plot the relative ranks (we first rank the researchers according to one indicator and
subsequently the ordered set of numbers is ranked again according to a second indicator) of researchers as determined by
the g-index, the h-index, and the total citation count, showing that maximal deviations of individual rankings increase
with the rank number, but remain uniformly distributed with respect to the diagonal throughout the set. Absolute values
of the indicators are depicted in support of this as well, in turn implying their statistical equivalence, but at the
same time strengthening their importance for individual ranking within specific groups of researchers.
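The quantities introduced in this section can be illustrated with a short sketch: the empirical cumulative distribution Q(x), the slope of a Zipf plot on a double logarithmic scale, the conversion from the Zipf exponent to the distribution exponents, and a maximum-likelihood estimate of the power-law exponent. This is a sketch under stated assumptions, not the fitting code used for the figures. All function names are hypothetical, the slope is obtained by ordinary least squares in log-log space rather than by the Levenberg-Marquardt method, and the maximum-likelihood formula is the continuous-data estimator described by Clauset et al. (2009), which is only approximate for integer citation counts.

```python
import math


def cumulative_distribution(citations):
    """Q(x): fraction of papers with at least x citations, for each distinct x."""
    n = len(citations)
    q = {}
    for i, x in enumerate(sorted(citations)):
        # At the first occurrence of x in ascending order, n - i papers have >= x.
        q.setdefault(x, (n - i) / n)
    return q


def loglog_slope(points):
    """Ordinary least-squares slope of ln(y) against ln(x) for (x, y) pairs."""
    u = [math.log(x) for x, _ in points]
    v = [math.log(y) for _, y in points]
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    sxy = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    sxx = sum((a - mu) ** 2 for a in u)
    return sxy / sxx


def exponents_from_zipf_slope(gamma):
    """Map the Zipf exponent gamma to (alpha, beta) = (1 + 1/gamma, 1/gamma)."""
    beta = 1.0 / gamma
    return 1.0 + beta, beta


def powerlaw_mle_alpha(xs, x_min):
    """Continuous-data maximum-likelihood estimate of alpha for x >= x_min.

    Assumption: at least one value lies strictly above x_min.
    """
    tail = [x for x in xs if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic Zipf plot x_k = 100 * k**-0.5, used only to exercise the functions.
zipf = [(k, 100.0 * k ** -0.5) for k in range(1, 51)]
gamma = -loglog_slope(zipf)
alpha, beta = exponents_from_zipf_slope(gamma)
print(gamma, alpha, beta)
```

On real data the least-squares slope depends on which range of ranks is included in the fit, which is why the lower bound of the power-law region has to be chosen with care.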



3. Results 

⁴ Based on publication records retrieved in January 2010.
⁵ A comprehensive set of methods for fitting power laws accompanying the review is available via http://www.santafe.edu/~aaronc/powerlaws/

Figure 1: Top row - Zipf plots of the number of citations $x_k$ versus the k-th ranked paper on a double logarithmic scale. Dashed lines with slope $\gamma$
in each panel are data fits depicted for visual reference. The red star by the $\gamma$ value in the middle panel indicates that for the "Jozef Stefan"
Institute the fit applies to a considerably narrower region than in the other panels. Bottom row - Citation distributions $P(x)$ (gray ○) and cumulative
citation distributions $Q(x)$ (black △) obtained from the number of citations $x$ to individual publications. Dashed gray and dotted black lines with
slopes $\alpha = 1 + 1/\gamma$ and $\beta = 1/\gamma$, respectively, where $\gamma$ is taken from the corresponding top panels, are depicted for visual
reference. Fitting the depicted cumulative citation distributions directly yields (from left to right): $\beta = 1.70(1)$, $x_{\min} = 22$, $R^2 = 0.999$;
$\beta = 1.92(1)$, $x_{\min} = 25$, $R^2 = 0.999$; $\beta = 1.75(2)$, $x_{\min} = 26$, $R^2 = 0.996$; $\beta = 1.36(1)$, $x_{\min} = 13$, $R^2 = 0.997$;
$\beta = 2.06(2)$, $x_{\min} = 18$, $R^2 = 0.997$, where $x_{\min}$ is the lower bound of the power-law behavior (Clauset et al., 2007) and $R^2$ is the
coefficient of determination. In the middle panel the p-value is lower than 0.1, thus making $Q(x) \sim x^{-\beta}$ a questionable model for the data.
Numbers in parentheses give the error on the last figure.

We start by presenting Zipf plots of the number of citations $x_k$ versus the k-th ranked paper on a double logarithmic
scale in the top row of Fig. 1. Results are presented separately for Slovenia (all 86,443 publications; 835,970 citations;
9.67 per paper), for the University of Ljubljana (subset of 30,767 publications; 263,958 citations; 8.58 per paper), for
the "Jozef Stefan" Institute (subset of 17,425 publications; 230,700 citations; 13.24 per paper), as well as for medicine
(subset of 19,220 publications; 195,119 citations; 10.15 per paper) and chemistry (subset of 11,370 publications;
126,055 citations; 11.09 per paper) as two representative fields of research. Apart from deviations at low and high
values of k, it is possible to fit a straight line reasonably well to the plots, with the least-squares fit yielding the
exponents $\gamma$ as depicted in the corresponding panels. Notably, for the "Jozef Stefan" Institute the Zipf plot has a
slight negative curvature across the whole span of k, thus making the appropriateness of the linear fit debatable (marked
with the red star). In any case, the "Jozef Stefan" Institute is special in that its publications have a comparatively
high average of citations per paper (13.24 compared to the national average of 9.67), and that in the past it had a
rather strict hierarchical constitution. Depending on the considered set of publications, $\gamma$ ranges from 0.47 to
0.71, which theoretically corresponds to power-law distributions $P(x) \sim x^{-\alpha}$ with $\alpha$ between 2.41 and
3.13, or equivalently to cumulative power-law distributions $Q(x) \sim x^{-\beta}$ with $\beta$ between 1.41 and 2.13.
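The conversion between the Zipf exponent and the exponents of the two distributions can be made explicit with a short derivation. It is a sketch of the standard argument rather than a statement taken from the fits; the symbol $N$ for the total number of papers in the considered set is introduced here only for illustration.

```latex
% The k-th ranked paper has x_k citations, so exactly k papers have at least x_k citations.
\begin{align}
  x_k \sim k^{-\gamma} \;&\Longrightarrow\; k \sim x_k^{-1/\gamma}, \\
  Q(x_k) \approx \frac{k}{N} \;&\Longrightarrow\; Q(x) \sim x^{-\beta},
      \qquad \beta = \frac{1}{\gamma}, \\
  P(x) = -\frac{\mathrm{d}Q}{\mathrm{d}x} \;&\Longrightarrow\; P(x) \sim x^{-\alpha},
      \qquad \alpha = \beta + 1 = 1 + \frac{1}{\gamma}.
\end{align}
% Worked values for the quoted range of gamma:
%   gamma = 0.47:  beta = 1/0.47 = 2.13,  alpha = 3.13
%   gamma = 0.71:  beta = 1/0.71 = 1.41,  alpha = 2.41
```

A flatter Zipf plot (smaller $\gamma$) therefore corresponds to a steeper citation distribution (larger $\alpha$).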

Figure 2: Top row - Zipf plots of the g-index $g_k$ (solid black) and h-index $h_k$ (dashed gray) versus the k-th ranked researcher on a double logarithmic
or semi-log (middle and rightmost panel) scale. For comparisons, it is useful to scale the k-th ranked g-index and h-index by $\langle g \rangle$ and
$\langle h \rangle$, respectively, where $\langle \cdot \rangle$ indicates the average over the corresponding researcher population. Bottom row - Cumulative
g-index $Q(g)$ (black ○) and h-index $Q(h)$ (gray △) distributions obtained from the corresponding researcher population. For comparisons, the h-index on
the horizontal axis was rescaled ($h \to h''$) to fit to the interval of the g-index. Green dashed lines indicate log-normal fits of the form
$Q(g) \sim \exp[-a \ln g - b(\ln g)^2]$, where the values of $a$ and $b$ are depicted in each panel. Where applicable, red dashed lines indicate stretched
exponential fits of the form $Q(g) \sim \exp(-g^{\delta})$, where the values of $\delta$ are depicted in each panel. In the middle and rightmost panel,
however, the distribution is not log-normal but exponential, such that $Q(g) \sim \exp(-g/\kappa)$, where $\kappa \approx 14(1)$ and
$\kappa \approx 8.7(3)$, respectively. Numbers in parentheses give the error on the last figure. The goodness of fit as determined via $R^2$ is beyond
0.99 in all cases, except for the stretched exponential fits where it equals 0.97.

The bottom row of Fig. 1 features $P(x)$ (gray ○) and $Q(x)$ (black △) of the corresponding Zipf plots from the
top row. It can be observed that the Zipf plots translate fairly accurately to their expected power-law cumulative
distributions $Q(x) \sim x^{-\beta}$, with Levenberg-Marquardt fits of the large-x values, i.e. $x > x_{\min}$,
delivering exponents in agreement with $\beta \approx 1/\gamma$ (see the caption of Fig. 1 for details). Moreover, the
corresponding distributions $P(x)$ also show power-law properties in that $P(x) \sim x^{-\alpha}$ on a double
logarithmic scale, with $\alpha \approx \beta + 1$. Altogether, these results are in good agreement with those presented
earlier by Redner (1998), where the distributions of citations to individual publications that were catalogued by the
Institute for Scientific Information and to 20 years of publications in the Physical Review D were also found to have a
large-x power law decay $P(x) \sim x^{-\alpha}$ with $\alpha \approx 3$. Here we show that these observations are fairly
robust to variations in research fields and institutions, and can indeed be observed for a nation as a whole. Moreover,
the prevalence of the Zipf law in citations to individual publications across different research fields and institutions
directly implies that the mechanisms underlying this phenomenon are robust as well. The cumulative advantage (de Solla
Price, 1965, 1976) of highly cited papers thus works irrespective of particularities that can be associated with
individual publications. On the other hand, it is also known that considering individual researchers as points of
reference rather than individual publications can lead to rather different results. In particular, Laherrère and
Sornette (1998) reported the occurrence of stretched exponentials rather than power laws when examining the
distributions of citations of most cited physicists. We therefore perform a similar statistical analysis as presented in
Fig. 1 also for individual researchers.

Figure 3: Top row - Comparison of researcher rankings based on different indicators of their scientific output. Researchers are first ranked according
to one indicator. Subsequently, the obtained ordered set k is reordered according to the ranking of researchers based on a second indicator, thus
yielding the relative rank $k_r$. Plotting k versus $k_r$ shows to what extent the ranking via the two considered indicators differs. If all points fell
on the diagonal (depicted dashed green for visual reference), this would imply that the two indicators yield an identical ranking of the considered set
of researchers. Compared pairs of indicators are (from left to right): total number of citations versus the g-index, total number of citations versus the
h-index, and g-index versus the h-index. Bottom row - Comparisons of absolute values of the indicators, corresponding to the pairs considered in
the top panels. A double logarithmic scale is used because of the substantially different maximal values of the compared indicators. Note also that
the top-seeded researchers in this representation are positioned top right rather than bottom left. In all the panels the top 500 researchers are displayed.

Zipf plots of the g-index $g_k$ (solid black) and h-index $h_k$ (dashed gray) versus the k-th ranked researcher on a
double logarithmic or semi-log scale (depending on the considered set of researchers) are presented in the top row of
Fig. 2. As above, results are presented separately for Slovenia (all 8,359 researchers), for the University of Ljubljana
(subset of 2,377 researchers), for the "Jozef Stefan" Institute (subset of 501 researchers), as well as for medicine
(subset of 1,684 researchers) and chemistry (subset of 588 researchers). By comparing these results to those presented
in the top row of Fig. 1, it becomes clear that for individual researchers power laws can no longer be advocated. The
curves either have a negative curvature across the whole set of $g_k$ and $h_k$ values, or can be fitted by a
straight line on a semi-log scale (middle and rightmost panel). Furthermore, it is remarkable to observe that the
g-index and the h-index (as well as the total citation count; not shown) have equivalent statistical properties in terms
of their Zipf plots as well as the corresponding cumulative distributions $Q(g)$ and $Q(h)$, which are shown in the
bottom row of Fig. 2. We find that the best fits to the cumulative distributions are obtained either by means of a
log-normal $Q(g) \sim \exp[-a \ln g - b(\ln g)^2]$ or an exponential $Q(g) \sim \exp(-g/\kappa)$ function, where the
values of $a$, $b$ and $\kappa$ (where applicable) are depicted in the corresponding panels. Notably, the departure from
the log-normal to the exponential distribution can be observed for the "Jozef Stefan" Institute (middle panel) and for
the research field of chemistry (rightmost panel). Although it is difficult to pinpoint exactly why this happens, some
clues can be gathered from the self-citation rates. The national average is 0.19, meaning that 160,725 of the total of
835,970 citations are self-citations. The University of Ljubljana has 0.22 (59,988 out of 263,958), the "Jozef Stefan"
Institute has 0.20 (46,940 out of 230,700), medicine has 0.13 (26,284 out of 199,947), while chemistry has 0.31 (38,659
out of 124,705). From these values it appears that fields of research with a relatively high self-citation rate, such as
chemistry in our case, are more likely to yield exponential distributions of scientific output related to individual
researchers. Regarding the "Jozef Stefan" Institute, which also features an exponential $Q(g)$, we have already noted its
formerly rather strict hierarchical constitution, which may have adversely affected the ranking of subordinate
individuals (or promoted the ranking of superior individuals). It is worth noting that the log-normal form applied in the
bottom row of Fig. 2 (green dashed line) can in our case be replaced fairly well also by a stretched exponential
$Q(g) \sim \exp(-g^{\delta})$ (red dashed line), which was reported by Laherrère and Sornette (1998), thus making our
results essentially in agreement with earlier works and extending their validity beyond specific fields of research as
well as institutions. Lastly, regarding the results presented in Fig. 2, it is interesting to note that log-normal
distributions were reported recently also by Redner (2005) for the citation data of 110 years of the Physical Review.
Although there individual papers were taken as points of reference, and one could therefore expect the prevalence of
power-law distributions in accordance with earlier works (Redner, 1998) and our Fig. 1, the fact that only internal
citations (i.e. citations from Physical Review articles to other Physical Review articles) were considered might have
been a factor contributing to the deviation.
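To make the comparison between the log-normal and the exponential form of Q(g) concrete, the following sketch fits both in log space, where ln Q is a quadratic polynomial in ln g for the log-normal form and a straight line in g for the exponential form. This is a simplified stand-in and not the procedure behind Fig. 2: the fits reported here use the Levenberg-Marquardt method on the distributions themselves, whereas the sketch uses linear least squares on ln Q, which weights the tail differently and can give somewhat different parameters. All function names are hypothetical, and the additive constant of each fit absorbs the normalization.

```python
import math


def least_squares_poly(xs, ys, degree):
    """Coefficients [c0, c1, ..., c_degree] of the least-squares polynomial."""
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j) and b[i] = sum y x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for j in range(col, m):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def r_squared(ys, fitted):
    """Coefficient of determination of fitted values against observations."""
    mean = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fit_lognormal(gs, qs):
    """Fit ln Q = c - a ln g - b (ln g)^2; return (a, b, R^2 in log space)."""
    u = [math.log(g) for g in gs]
    v = [math.log(q) for q in qs]
    c0, c1, c2 = least_squares_poly(u, v, 2)
    fitted = [c0 + c1 * x + c2 * x * x for x in u]
    return -c1, -c2, r_squared(v, fitted)


def fit_exponential(gs, qs):
    """Fit ln Q = c - g / kappa; return (kappa, R^2 in log space)."""
    v = [math.log(q) for q in qs]
    c0, c1 = least_squares_poly(list(gs), v, 1)
    fitted = [c0 + c1 * g for g in gs]
    return -1.0 / c1, r_squared(v, fitted)


# Synthetic curves with known parameters, used only to exercise the functions.
g_values = list(range(1, 101))
q_lognormal = [math.exp(-0.5 * math.log(g) - 0.3 * math.log(g) ** 2) for g in g_values]
q_exponential = [math.exp(-g / 14.0) for g in g_values]
print(fit_lognormal(g_values, q_lognormal))
print(fit_exponential(g_values, q_exponential))
```

Since the log-normal form has one more free parameter than the exponential form, a comparison based on R² alone tends to favor it; a visual check on a semi-log scale, as done in the middle and rightmost panels of Fig. 2, remains useful.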

With respect to the statistical equivalence of the distributions of the g-index and the h-index (as well as the total
citation count; not shown), it is instructive to examine relative rankings of pairs of different indicators. First
ordering the researchers by rank according to their total citation count, and then ranking the ordered set again
according to the g-index, shows how (and in which direction) the ranking of an individual differs when evaluated via the
total citation count or via the g-index. This can be done for different combinations of scientific output indicators, as
presented in the top row of Fig. 3 for the top 500 researchers of Slovenia. It can be observed that differences in
ranking are indeed present, but they seem equally probable in both directions for any given k; it is not as if a given
indicator would systematically downgrade only those with low k, for example. It is also interesting to note that the
deviations from the diagonal become larger with increasing k, which indicates that lower-ranking researchers are more
likely to be rated differently by different measures, while high-ranking researchers will remain top-seeded irrespective
of which indicator is used. Importantly, however, this observation is not entirely surprising because, as we move towards
the lower rankings, more and more researchers will have the same indicator value, so that small absolute changes of the
indicator are more likely to lead to large changes in the rank. We therefore show in the bottom row of Fig. 3 the
pertaining comparisons of absolute values of the different indicators for the top 500 researchers, which, however,
confirm to a large extent that the ranking via different indicators is more likely to deviate for lower-ranking than for
the top-seeded researchers. Given the definitions of the g-index (Egghe, 2006) and the h-index (Hirsch, 2005), as well as
their relatedness to the total citation count, these results are not surprising and confirm the consistency and logical
connectedness of these measures. At the same time, they provide some justification as to why the distributions of the
g-index and the h-index are practically equivalent (see Fig. 2), but also point out the fact that the properties of the
citation record of each individual are crucial for his or her ranking within a given group. Different indexes and
measures of scientific output (Hirsch, 2007; Iglesias and Pecharromán, 2007; Jin et al., 2007; Sidiropoulos et al., 2007;
Rousseau and Ye, 2008; Bar-Ilan, 2008) are therefore extremely useful and indeed much needed to properly evaluate the
quality and impact of individual researchers.
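The relative-rank comparison described above can be sketched in a few lines. The function names are hypothetical, the sample numbers are made up for illustration, and breaking ties by the original order of the researchers is an assumption, since the treatment of ties is not specified here.

```python
def ranks(values):
    """Rank 1 goes to the largest value; ties keep their original order (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    rank = [0] * len(values)
    for position, i in enumerate(order, start=1):
        rank[i] = position
    return rank


def relative_ranks(first, second):
    """Pairs (k, k_r): rank k by `first` and rank k_r of the same researcher by `second`."""
    return sorted(zip(ranks(first), ranks(second)))


# Four hypothetical researchers: total citation counts and h-indexes.
total_citations = [500, 300, 800, 100]
h_indexes = [12, 15, 14, 5]
pairs = relative_ranks(total_citations, h_indexes)
print(pairs)

# Largest displacement from the diagonal k = k_r.
print(max(abs(k - k_r) for k, k_r in pairs))
```

Points with k equal to k_r lie on the diagonal; the distance from the diagonal measures how strongly the two indicators disagree about a given researcher.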



4. Summary 

In sum, we have shown that the distributions of citations per publication for different institutions and research
fields, as well as Slovenia as a whole, have Zipfian properties in that they can be fitted fairly accurately by a power
law. On the other hand, taking into account individual researchers rather than publications, we have shown that the
cumulative distributions of Egghe's g-index and Hirsch's h-index are consistent with a log-normal, or in the case of
research fields with high self-citation rates or organizations with a special constitution, an exponential form.
Interestingly, the distributions of the two indexes are statistically equivalent, thus implying their consistency and
logical connectedness, but at the same time also strengthening their importance for properly assessing the scientific
output of individual researchers. As a cautionary note with respect to the national selection criteria currently
employed by the Slovenian Research Agency (ARRS), we note that a favorable bias in ranking emerges due to not taking
into account the number of co-authors when evaluating the citation data of individual researchers (Wan et al., 2007;
Schreiber, 2008b; Egghe, 2008b). Consequently, researchers that are members of collaboration networks involved in
particle physics research (e.g. DELPHI, Belle or HERA-B) dominate the rankings. We hope the study will be useful for
deriving theoretical models (Egghe, 2009) explaining the emergence of empirically observed distributions and for drawing
further attention to this interesting topic.